Abstract: We consider the problem of data exchange 
by a group of closely-located wireless nodes. In this 
problem each node holds a set of packets and needs to 
obtain all the packets held by other nodes. Each of the 
nodes can broadcast the packets in its possession (or a 
combination thereof) via a noiseless broadcast channel 
of capacity one packet per channel use. The goal is to 
minimize the total number of transmissions needed to 
satisfy the demands of all the nodes, assuming that they 
can cooperate with each other and are fully aware of 
the packet sets available to other nodes. This problem 
arises in several practical settings, such as peer-to-peer 
systems and wireless data broadcast. In this paper, we 
establish upper and lower bounds on the optimal number 
of transmissions and present an efficient algorithm with 
provable performance guarantees. The effectiveness of our 
algorithm is established through numerical simulations. 



I. Introduction 

In recent years there has been a growing interest in 
developing cooperative strategies for wireless communications 
[1], [2]. Cooperative communication is a promising 
technology for the future that can provide distributed 
space-time diversity, energy efficiency, increased cover- 
age, and enhanced data rates. In a cooperative setting 
users aid each other to achieve a common goal sooner, 
or in a more robust or energy efficient manner. This is in 
contrast to a traditional non-cooperative scenario where 
users compete against each other for channel resources 
(time, bandwidth, etc). 

In this paper we consider the problem of cooperative 
data exchange. To motivate the problem, consider a 
group of mobile users or clients who wish to download 
a large file, which is divided into n packets, from a 
base station. The common goal here is to minimize 
the total download time. The long-range link from each 
client to the base station is subject to long-term path 
loss and shadowing, as well as short-term fading [3], 
which often render it unreliable and slow. As a result, 
after n transmissions from the base station, each client 
may have received only a subset of degrees of freedom 
required for full decoding. A possible strategy for the 
base station is to employ network coding and keep 
transmitting innovative packets until every client has 
fully decoded all packets [4], [5]. 

If the clients happen to be in the vicinity of each other, 
an alternative strategy is for them to switch to short-range 
transmission as soon as all packets are collectively 
owned by the group. This strategy has two main benefits: 

• short-range communication is often much more 
reliable and faster; 

• after only n transmissions, the valuable long-range 
channel is freed and the base station can serve other 
clients in the system. 

To illustrate the problem further, consider four wireless 
clients who have requested $n = 4$ packets, 
$x_1, \ldots, x_4 \in GF(2^m)$, from the base station. However, 
due to channel imperfections, the first client received only 
packet $x_1$, while the second, third, and fourth clients 
received packets $\{x_2, x_4\}$, $\{x_2, x_3\}$, and $\{x_1, x_3\}$, 
respectively. Since they have collectively received all the 
packets, they can now communicate among themselves 
to complete the exchange and ensure that 
all the clients eventually possess all the packets. 

In this paper we make the following assumptions: 
Fig. 1. Data exchange among four clients. 



1) each mobile client can broadcast data to all other 
clients at a rate of one packet per transmission, 
i.e., m bits per transmission in this example. 
Furthermore, all the clients receive this transmission 
error-free; 

2) each client knows the packets that were received 
by others. 

The first assumption is justified when mobile users are 
close to each other. The second assumption can be satisfied 
at the outset by having each client broadcast the indices of 
its received packets to the other clients. 

A naive solution to this problem would require four 
transmissions: the first client sends $x_1$, the second client 
sends $x_2$ and then $x_4$, and finally the third client sends 
$x_3$. The question we are interested in here is: can 
the clients do better in terms of the number of transmissions 
and, if so, how? For the above example, it is easy 
to see that with coding we can reduce the number of 
transmissions from 4 to 3. Figure 1 shows a coding 
scheme with 3 transmissions where the second, third, and 
fourth clients send the coded packets $x_2 + x_4$, $x_2 + x_3$, 
and $x_1 + x_3$, respectively. It can be verified that all the 
mobile clients can then decode all the packets. In fact, this 
is the minimum number of transmissions, since the first client 
initially received only one packet and hence needs to receive 
at least three degrees of freedom from the other clients. 
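
The decoding claim for the four-client example can be checked mechanically. The following is a minimal sketch, not taken from the paper: it assumes coefficients in GF(2) (a subfield of $GF(2^m)$), encodes each coding vector as an integer bitmask, and uses the hypothetical helpers `mask`, `gf2_rank`, and `can_decode_all`. A client can decode everything exactly when its own packets together with the transmissions it hears span all n dimensions.

```python
def gf2_rank(vectors):
    """Rank over GF(2) of vectors encoded as integer bitmasks."""
    basis = {}  # position of leading bit -> basis vector
    for v in vectors:
        while v:
            top = v.bit_length() - 1
            if top not in basis:
                basis[top] = v
                break
            v ^= basis[top]  # eliminate the leading bit and keep reducing
    return len(basis)


def mask(*indices):
    """Coding vector of x_j1 + x_j2 + ... (1-based packet indices)."""
    m = 0
    for j in indices:
        m |= 1 << (j - 1)
    return m


n = 4
# Packets initially held by clients 1..4 in the example above.
holdings = {1: [1], 2: [2, 4], 3: [2, 3], 4: [1, 3]}
# Coded packets sent by clients 2, 3, and 4.
transmissions = {2: mask(2, 4), 3: mask(2, 3), 4: mask(1, 3)}


def can_decode_all(client):
    own = [mask(j) for j in holdings[client]]
    heard = [t for sender, t in transmissions.items() if sender != client]
    return gf2_rank(own + heard) == n


# Each sender may only combine packets it actually holds.
valid_senders = all(t & ~mask(*holdings[s]) == 0 for s, t in transmissions.items())
decodable = {c: can_decode_all(c) for c in holdings}
```

Under these assumptions every entry of `decodable` is true, which matches the statement that three coded transmissions suffice for this instance.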

This problem is related to the index coding problem 
[6], [7], [8], [9], [10] in which the different clients cannot 
communicate with each other, but can receive transmissions 
from a server possessing all the data. Moreover, 
in the index coding problem different clients might have 
different demands, while in the problem considered here 
each client wants to obtain all the available packets. 
Another related line of work is that of gossip algorithms 
[11], where the goal is to efficiently and distributively 
compute a function of the data present in a dynamic 
network. 

Contribution. In this paper, we develop a framework 
towards finding optimum strategies for cooperative data 
exchange. To the best of our knowledge, this problem 
setting has not been previously considered in the litera- 
ture. We establish upper and lower bounds on the optimal 
number of transmissions. We also present an efficient 
algorithm with a provable performance guarantee. The 
effectiveness of our approach is verified through numer- 
ical simulations. 

Organization. The rest of this paper is organized 
as follows. In Section II we present the model and 
formally define the problem. In Section III we establish 
lower and upper bounds on the required number of 
transmissions. In Section IV we present a deterministic 
algorithm and analyze its performance. Our evaluation 
results are presented in Section VI. Finally, conclusions 
and directions for future work appear in Section VII. 



II. Model 

The problem can be formally defined as follows. A 
set $X = \{x_1, \ldots, x_n\}$ of n packets, each belonging to 
a finite alphabet $\mathcal{A}$, needs to be delivered to a set of k 
clients $C = \{c_1, \ldots, c_k\}$. Each client $c_i \in C$ initially 
holds a subset $X_i$ of packets in X, i.e., $X_i \subseteq X$. We 
denote by $n_i = |X_i|$ the number of packets initially 
available to client $c_i$, and by $\overline{X}_i = X \setminus X_i$ the set 
of packets required by $c_i$. We assume that the clients 
collectively know all packets in X, i.e., $\bigcup_{c_i \in C} X_i = X$. 
Each client can communicate to all its peers through 
an error-free broadcast channel of capacity one packet 
per channel use. The problem is to find a scheme that 
requires the minimum number of transmissions to make 
all the packets available to all the clients. 

We focus in this paper on the design of linear solutions 
to the problem at hand. In a linear solution, each packet 
is considered to be an element of a finite field F and 
all the encoding operations are linear over this field. 
For a given instance of this problem, define r to be 
the minimum total number of transmissions required 
to satisfy the demands of all the receivers with linear 
coding. 

We denote by $n_{\min}$ the minimum number of packets 
held by a client, i.e., $n_{\min} = \min_{1 \leq i \leq k} n_i$, and by $n_{\max}$ 
the maximum number of packets held by a client, i.e., 
$n_{\max} = \max_{1 \leq i \leq k} n_i$. 

We say that a client $c_i$ has a unique packet $x_j$ if 
$x_j \in X_i$ and $x_j \notin X_\ell$ for all $\ell \neq i$. Note that, without 
loss of generality, we can assume that no client has a 
unique packet. Indeed, a unique packet can be broadcast 
by the client holding it in an uncoded form without any 
penalty in terms of optimality. 
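
The reduction to instances without unique packets can be sketched as a small preprocessing step. The function name `split_unique_packets` and the dictionary representation (client index mapped to a set of packet indices) are illustrative assumptions, not notation from the paper.

```python
from collections import Counter


def split_unique_packets(holdings):
    """Return (unique, reduced).

    unique[c]  : packets held by client c and by no other client
    reduced[c] : holdings of c after every unique packet has been
                 broadcast uncoded by its only owner
    """
    counts = Counter(p for packets in holdings.values() for p in packets)
    unique = {c: {p for p in packets if counts[p] == 1}
              for c, packets in holdings.items()}
    broadcast = set().union(*unique.values())
    reduced = {c: set(packets) | broadcast for c, packets in holdings.items()}
    return unique, reduced


# Four-client example from the Introduction: only client 2 holds x_4.
holdings = {1: {1}, 2: {2, 4}, 3: {2, 3}, 4: {1, 3}}
unique, reduced = split_unique_packets(holdings)
```

Each unique packet costs exactly one uncoded transmission, after which the reduced instance contains no unique packets.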

III. Upper and Lower Bounds 

We begin by establishing a lower bound on the number 
of transmissions. 

Lemma 1: The minimum number of transmissions is 
greater than or equal to $n - n_{\min}$. If all clients initially 
have the same number of packets $n_{\min} < n$, i.e., 
$n_i = n_{\min}$ for $i = 1, \ldots, k$, then the minimum number 
of transmissions is greater than or equal to $n - n_{\min} + 1$. 

Proof: The first part follows from the fact that each 
client $c_i$ needs to receive at least $n - n_i$ packets. The second 
part follows from the fact that a transmitting client does 
not benefit from its own transmissions. ■ 
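
As a quick illustration (not part of the original proof), the bound of Lemma 1 can be evaluated on two small instances that appear in this paper: the four-client example of Fig. 1, and the three-client instance $X_1 = \{x_2, x_3\}$, $X_2 = \{x_1, x_3\}$, $X_3 = \{x_1, x_2\}$ discussed later in this section.

```latex
% Fig. 1 instance: n = 4 and (n_1, n_2, n_3, n_4) = (1, 2, 2, 2),
% so the clients do not all hold the same number of packets.
\[
  n_{\min} = 1
  \quad\Longrightarrow\quad
  r \;\geq\; n - n_{\min} \;=\; 4 - 1 \;=\; 3 .
\]

% Three-client instance: n = 3 and n_1 = n_2 = n_3 = 2 = n_{\min} < n,
% so the second part of Lemma 1 applies.
\[
  r \;\geq\; n - n_{\min} + 1 \;=\; 3 - 2 + 1 \;=\; 2 .
\]
```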

Next, we present an upper bound on the minimum 
required number of transmissions r. 

Lemma 2: For $|F| \geq k$, it holds that 

$$ r \leq \min_{1 \leq i \leq k} \left\{ |\overline{X}_i| + \max_{1 \leq j \leq k} |\overline{X}_j \cap X_i| \right\}. \qquad (1) $$

Proof: Consider the following solution consisting 
of two phases: 

1) Phase 1: pick a client $c_i$ and make it a "leader" 
by satisfying all of its demands with uncoded 
transmissions from the other clients. This requires 
$|\overline{X}_i|$ transmissions. 

2) Phase 2: client $c_i$ broadcasts coded packets to 
satisfy the demands of all the other clients. 

After Phase 1, each client $c_j$ knows all the packets in 
$X_j \cup \overline{X}_i$ and, thus, requires the packets in $\overline{X}_j \cap X_i$. By using 
network coding techniques (see, e.g., [12]), Phase 2 can 
be accomplished using $\max_j |\overline{X}_j \cap X_i|$ transmissions, 
provided that the size of the finite field F is at least k. 
Indeed, the leader $c_i$ can form encoded packets in such a 
way that, after each transmission, the number of degrees of 
freedom increases by one for every client $c_j$, $j \neq i$, that has 
not yet received all the packets. ■ 

The bound of Lemma 2 is tight in many instances, 
in particular when all the sets $X_i$ are disjoint (i.e., 
$X_i \cap X_j = \emptyset$ for $i \neq j$). In this trivial case, the minimum 
number of transmissions is equal to n, and the bound 
is tight. The following is a non-trivial example where 
the above bound is also tight: $X_1 = \{x_2, x_3\}$, $X_2 = 
\{x_1, x_3\}$ and $X_3 = \{x_1, x_2\}$. In this case, the upper 
bound of Lemma 2 gives $r \leq 2$, while Lemma 1 gives 
$r \geq n - n_{\min} + 1 = 2$. Therefore $r = 2$, and one 
way this can be achieved is by letting $c_1$ and $c_2$ transmit 
$x_2 + x_3$ and $x_1 + x_3$, respectively. 

The bound of Lemma 2 is not always tight, since 
the scheme described in the proof is not guaranteed 
to be optimal. This is due to the fact that a client is 
made a leader using only uncoded transmissions. 
Consider, for example, the following instance with four 
clients, where $X_1 = \{x_2, x_3, x_4\}$, $X_2 = \{x_1, x_4\}$, 
$X_3 = \{x_1, x_2, x_4\}$ and $X_4 = \{x_1, x_3\}$. From 
Lemma 2, we get $r \leq 3$. However, there exists 
a solution with two transmissions where $c_1$ transmits 
$x_3 + x_4$ and $c_3$ transmits $x_1 + x_2 + x_4$. By Lemma 1 
we know that this scheme is optimal, and $r = 2$. 

In the next section, we present an additional bound on 
r. In particular, Lemma 3 shows that 
$r \leq \min\{n, 2n - n_{\max} - n_{\min}\}$. 

IV. A Deterministic Algorithm 

We proceed to present an efficient deterministic 
algorithm for the information exchange problem. At each 
iteration of the algorithm, one of the clients broadcasts 
a linear combination of the packets in X. For a coded 
packet x we denote by $T_x \in F^n$ the corresponding vector 
of linear coefficients, i.e., $x = T_x \cdot (x_1, \ldots, x_n)^T$. 

We also denote by $Y_i$ the subspace spanned by the vectors 
corresponding to the linear combinations available at 
client $c_i$. In the beginning of the algorithm, $Y_i$ is equal 
to the subspace spanned by the vectors that correspond to 
the packets in $X_i$, i.e., $Y_i = \langle \{T_x \mid x \in X_i\} \rangle$. The 
goal of our algorithm is to simultaneously increase the 
dimension of the subspaces $Y_i$, $i = 1, \ldots, k$, for as many 
clients as possible. 

Specifically, at each iteration, the algorithm identifies 
a client $c_i \in C$ whose subspace $Y_i$ is of maximum 
dimension. Then, client $c_i$ selects a vector $b \in Y_i$ 
in a way that increases the dimension of $Y_j$ for each 
client $c_j \neq c_i$, and transmits the corresponding packet 
$b \cdot (x_1, \ldots, x_n)^T$. Vector b must satisfy $b \notin Y_j$ for 
all $j \neq i$. Such a vector b exists and can be selected 
using the network coding techniques [12] provided that 
$|F| \geq k$. 

At any iteration, the subspace associated with a certain 
client corresponds to the original packets possessed 
by this client, together with the packets transmitted in 
the previous iterations. As a result, at some iteration, 
the subspaces associated with a number of clients may 
become identical. In this case, without loss of generality, 
we merge this group of clients into a single client with 
the same subspace. 

The formal description of the algorithm is presented 
below. 


Algorithm IE (Information Exchange) 

 1  for $i \leftarrow 1$ to $k$ 
 2     do 
 3        $Y_i = \langle \{T_x \mid x \in X_i\} \rangle$ 
 4  while there is a client $c_i$ with $\dim Y_i < n$ 
 5     do 
 6        while $\exists\, c_i, c_j \in C$, $i \neq j$, such that $Y_i = Y_j$ 
 7           do 
 8              $C = C \setminus \{c_i\}$ 
 9        Find a client $c_i$ with a subspace $Y_i$ of 
          maximum dimension (if there are multiple 
          such clients, choose an arbitrary one of them) 
10        Select a vector $b \in Y_i$ such that $b \notin Y_j$ 
          for each $j \neq i$ 
11        Let client $c_i$ broadcast packet $b \cdot (x_1, \ldots, x_n)^T$ 
12        for $i \leftarrow 1$ to $k$ 
13           do 
14              $Y_i \leftarrow Y_i + \langle \{b\} \rangle$ 
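
The listing above can be prototyped in a few lines. The sketch below is an illustration under stated assumptions rather than the authors' implementation: it works over a prime field GF(p) with an assumed default p = 257 (any prime $p \geq k$ would do), and in Step 10 it replaces the deterministic selection technique of [12] with simple rejection sampling of random vectors in $Y_i$. The names `rref_mod_p` and `algorithm_ie` are hypothetical, and Python 3.8 or later is assumed for the modular inverse `pow(x, -1, p)`. Subspaces are stored as reduced row-echelon bases, so two clients with identical subspaces collapse into one set element, which mirrors the merging in Steps 6 to 8.

```python
import random


def rref_mod_p(rows, p):
    """Reduced row-echelon basis (tuple of tuples) of the row space mod p."""
    rows = [list(r) for r in rows]
    ncols = len(rows[0]) if rows else 0
    r = 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] % p), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        inv = pow(rows[r][col], -1, p)
        rows[r] = [(x * inv) % p for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][col] % p:
                f = rows[i][col]
                rows[i] = [(a - f * b) % p for a, b in zip(rows[i], rows[r])]
        r += 1
    return tuple(tuple(row) for row in rows[:r])


def algorithm_ie(holdings, n, p=257, seed=0):
    """Return the list of coding vectors broadcast by the algorithm.

    holdings: list of sets of 0-based packet indices (one set per client).
    p is assumed to be a prime with p >= number of clients.
    """
    assert p >= len(holdings)
    assert set().union(*holdings) == set(range(n)), "clients must jointly hold X"
    rng = random.Random(seed)

    def unit(j):
        return tuple(int(t == j) for t in range(n))

    # Steps 1-3; using a set merges clients whose subspaces coincide (Steps 6-8).
    spaces = {rref_mod_p([unit(j) for j in sorted(h)], p) for h in holdings}
    sent = []
    while any(len(Y) < n for Y in spaces):                      # Step 4
        Yi = max(spaces, key=len)                               # Step 9
        others = [Y for Y in spaces if Y != Yi]
        while True:                                             # Step 10 (randomized)
            coeffs = [rng.randrange(p) for _ in Yi]
            b = tuple(sum(c * row[t] for c, row in zip(coeffs, Yi)) % p
                      for t in range(n))
            if all(len(rref_mod_p(list(Y) + [b], p)) > len(Y) for Y in others):
                break
        sent.append(b)                                          # Step 11
        spaces = {rref_mod_p(list(Y) + [b], p) for Y in spaces}  # Steps 12-14
    return sent


# Three-client instance X_1 = {x_2, x_3}, X_2 = {x_1, x_3}, X_3 = {x_1, x_2}.
three_clients = algorithm_ie([{1, 2}, {0, 2}, {0, 1}], n=3)
```

Rejection sampling is used only to keep the sketch short; because a vector space over GF(p) cannot be covered by fewer than p + 1 proper subspaces, a valid b always exists when $p \geq k$, so the inner loop terminates with probability one.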



Lemma 3: The number of transmissions made by 
Algorithm IE is at most $\min\{n, 2n - n_{\max} - n_{\min}\}$, 
provided that $|F| \geq k$. 

Proof: First, we show that in Step 10 of the algorithm 
it is always possible to select a vector b in $Y_i$ such 
that $b \notin Y_j$ for each $j \neq i$. We note that the algorithm 
maintains the invariant $Y_i \neq Y_j$ for $i \neq j$. Since in Step 9 
of the algorithm we select a client $c_i$ with a maximum 
dimension of $Y_i$, we can then use an argument similar 
to that used in [12] to show that there exists a vector 
$b \in Y_i$ such that $b \notin Y_j$ for $j \neq i$. 

Note also that once $Y_i = Y_j$ for two different clients $c_i$ 
and $c_j$, they will be identical for the rest of the algorithm. 
Thus, we can remove one of the clients at Step 8 as 
described previously. 

We proceed to analyze the number of transmissions 
made by Algorithm IE. We note that each transmission 
is linearly independent of the others. Therefore, the 
total number of transmissions is bounded by n. 

Let $c_i$ be a client with $|X_i| = n_{\max}$. Note that $n - n_{\max}$ 
transmissions are needed in order to satisfy the 
demands of $c_i$. We consider two cases: 

First, suppose that $\dim Y_i \neq n$ until the last iteration 
of the main loop (that begins on Step 4). Then, let 
$c_j$ be a client that has transmitted a packet at the last 
iteration. Note that at the beginning of the last iteration 
$Y_i \subset Y_j$, since $c_j$ should know all the packets to be 
able to transmit in the last iteration. Therefore, $c_i$ and $c_j$ 
will only merge upon the completion of the algorithm. 
In terms of the number of transmissions, the worst-case 
scenario occurs when in each transmission round 
the dimension of either $Y_i$ or $Y_j$, but not both, 
increases by one. Since $|X_j| \geq n_{\min}$, the total number 
of transmissions is at most $n - n_{\max} + n - n_{\min} = 
2n - n_{\max} - n_{\min}$. 

Second, suppose that the dimension of $Y_i$ is equal to 
n before the last iteration of the main loop. In this case, 
we select $c_j$ to be a client for which $\dim Y_j \neq n$ until 
the last iteration. Using a similar argument as in the first 
case, we can show that the number of transmissions is 
at most $2n - n_{\max} - n_{\min}$. ■ 

Corollary 4: The number of transmissions made by 
Algorithm IE is at most twice the optimal number of 
transmissions. 

Proof: By Lemma 3 the number of transmissions 
made by Algorithm IE is at most $2n - n_{\max} - n_{\min}$. 
By Lemma 1 the optimum number of transmissions 
is at least $n - n_{\min}$. Since $n_{\max} \geq n_{\min}$, we have 
$2n - n_{\max} - n_{\min} \leq 2(n - n_{\min})$, so the number of 
transmissions is at most twice the optimum. ■ 



Our empirical results, presented in Section VI, show 
that Algorithm IE performs very well in practical 
settings. 



V. Relations to Rank Optimization Problems 

In this section, we show that r can be obtained by 
solving a rank optimization problem. 

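Let $n' = |X_1| + |X_2| + \cdots + |X_k| = n_1 + n_2 + \cdots + n_k$. 
We define $M_F(n_i, n)$ to be the set of $n_i \times n$ matrices 
with entries in the field F. To each client $c_i$, we associate 
the set of matrices 

$$ \mathcal{A}_F^i := \{ [a_{h\ell}] \in M_F(n_i, n) \mid a_{h\ell} = 0 \text{ if } x_\ell \notin X_i, \; h = 1, \ldots, n_i \}. $$

We also define the set of matrices 

$$ \mathcal{A}_F := \left\{ A \in M_F(n', n) \;\middle|\; A = \begin{bmatrix} A_1 \\ \vdots \\ A_k \end{bmatrix}, \text{ where } A_i \in \mathcal{A}_F^i \right\}. $$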



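For example, the matrices $A_i \in \mathcal{A}_F^i$ for the instance 
depicted in Figure 1 have the following form: 

$$ A_1 = \begin{bmatrix} * & 0 & 0 & 0 \end{bmatrix}, \quad
A_2 = \begin{bmatrix} 0 & * & 0 & * \\ 0 & * & 0 & * \end{bmatrix}, \quad
A_3 = \begin{bmatrix} 0 & * & * & 0 \\ 0 & * & * & 0 \end{bmatrix}, \quad
A_4 = \begin{bmatrix} * & 0 & * & 0 \\ * & 0 & * & 0 \end{bmatrix}, $$

where "*" is the "don't care" symbol: each entry 
with this symbol can independently take any value in 
the field F. 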

Let $e_1, e_2, \ldots, e_n$ be the canonical basis of the vector 
space $F^n$, i.e., the coordinates of $e_i$ are all zeros except 
the i-th coordinate, which is equal to 1. To each client $c_i$, 
we also associate the matrix $B_i \in M_F(n_i, n)$ whose row 
vectors are the vectors $e_j$ in the canonical basis satisfying 
$x_j \in X_i$. 

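Going back to the instance depicted in Figure 1, we 
have: 

$$ B_1 = \begin{bmatrix} 1 & 0 & 0 & 0 \end{bmatrix}, \quad
B_2 = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \quad
B_3 = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}, \quad
B_4 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}. $$
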

The following theorem is easy to establish: 

Theorem 5: The minimum number of transmissions 
r achieved by linear codes is given by the following 
optimization problem: 

$$ r = \min_{A \in \mathcal{A}_F} \operatorname{rank}(A) $$

subject to: 

$$ \operatorname{rank} \begin{bmatrix} A \\ B_i \end{bmatrix} = n, \quad \forall i = 1, \ldots, k. $$
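
For very small instances the optimization in Theorem 5 can be solved by exhaustive search. The sketch below is only an illustration under a restrictive assumption: it fixes F = GF(2), so it returns the best scalar linear solution over the binary field, which in general may be larger than the minimum over bigger fields. The helper names `gf2_rank` and `min_rank_gf2` are hypothetical. Each row of the block $A_i$ ranges over all binary vectors supported on $X_i$, and the constraint is checked by stacking A on top of the unit rows of $B_i$.

```python
from itertools import product


def gf2_rank(vectors):
    """Rank over GF(2) of vectors encoded as integer bitmasks."""
    basis = {}
    for v in vectors:
        while v:
            top = v.bit_length() - 1
            if top not in basis:
                basis[top] = v
                break
            v ^= basis[top]
    return len(basis)


def min_rank_gf2(holdings, n):
    """Brute-force the rank minimization of Theorem 5 over GF(2).

    holdings: list of sets of 0-based packet indices, one per client.
    Returns the minimum rank of a feasible A, or None if none exists.
    """
    # Rows of B_i: unit vectors of the packets held by client i.
    b_rows = [[1 << j for j in sorted(h)] for h in holdings]
    # Every row of block A_i is an arbitrary GF(2) vector supported on X_i.
    row_choices = []
    for h in holdings:
        cols = sorted(h)
        options = [sum(1 << j for j, bit in zip(cols, bits) if bit)
                   for bits in product((0, 1), repeat=len(cols))]
        row_choices += [options] * len(cols)

    best = None
    for a_rows in product(*row_choices):
        # Feasible iff every client can span F^n with A stacked on B_i.
        if all(gf2_rank(list(a_rows) + b) == n for b in b_rows):
            rank = gf2_rank(a_rows)
            best = rank if best is None else min(best, rank)
    return best


fig1_optimum = min_rank_gf2([{0}, {1, 3}, {1, 2}, {0, 2}], n=4)  # 3
```

For the instance of Figure 1 the search covers 8192 candidate matrices and returns 3, in agreement with the three-transmission scheme described in the Introduction. The search space grows exponentially with $n'$, so this approach is meant only for sanity checks.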
[Figure 2. Curves, as listed in the legend: lower bound (Lemma 1); deterministic Algorithm IE; upper bound (Lemma 2); trivial bound of n transmissions. Horizontal axis: number of packets, n.] 

Fig. 2. Numerical results with three clients using Algorithm IE 
and its comparison with the upper and lower bounds of Section III. 



VI. Numerical Results 

In this section, we evaluate the lower and upper 
bounds on the optimum number of transmissions r. We 
also verify the performance of the algorithm presented 
in Section IV. 

Figure 2 shows numerical results for $k = 3$ clients.¹ 
The number of packets ranges from $n = 10$ to $n = 50$. 
Each curve (except for the top one) represents the 
average over 100 random initializations of the problem 
(we randomly selected the sets $X_i$ subject to $\bigcup_i X_i = X$). The 
bottom curve shows the lower bound of Lemma 1. The 
next curve is the average number of transmissions required 
by Algorithm IE. Remarkably, the algorithm performs 
very close to the lower bound. The next curve shows the 
upper bound of Lemma 2. Finally, the top curve shows 
the trivial upper bound of n transmissions. 

Finally, we have observed similar trends for a larger 
number of clients k and have omitted the numerical 
results for brevity. 
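
The bound curves of this experiment can be approximated with a short script. The sketch below is not the authors' simulation code. In particular, the paper does not specify how the sets $X_i$ were sampled, so `random_instance` uses an assumed rule: every client receives every packet independently with probability 1/2, and a packet that nobody received is assigned to one client chosen at random, which guarantees that the union of the sets is X. All function names are illustrative.

```python
import random


def random_instance(n, k, rng):
    """Assumed sampling rule; see the lead-in paragraph."""
    holdings = [set() for _ in range(k)]
    for packet in range(n):
        owners = [c for c in range(k) if rng.random() < 0.5] or [rng.randrange(k)]
        for c in owners:
            holdings[c].add(packet)
    return holdings


def lemma1_lower_bound(holdings, n):
    sizes = [len(h) for h in holdings]
    n_min = min(sizes)
    # Second part of Lemma 1: all clients hold the same number n_min < n.
    extra = 1 if n_min < n and all(s == n_min for s in sizes) else 0
    return n - n_min + extra


def lemma2_upper_bound(holdings, n):
    everything = set(range(n))
    best = n
    for leader in holdings:
        phase1 = len(everything - leader)            # uncoded packets sent to the leader
        phase2 = max(len((everything - other) & leader) for other in holdings)
        best = min(best, phase1 + phase2)
    return best


def average_bounds(n, k=3, trials=100, seed=0):
    """Average (lower bound, upper bound) over random instances."""
    rng = random.Random(seed)
    lower = upper = 0
    for _ in range(trials):
        instance = random_instance(n, k, rng)
        lower += lemma1_lower_bound(instance, n)
        upper += lemma2_upper_bound(instance, n)
    return lower / trials, upper / trials


curves = {n: average_bounds(n) for n in range(10, 51, 10)}
```

Because the sampling rule is an assumption, the resulting averages are not expected to reproduce Figure 2 exactly; the ordering lower bound, upper bound, trivial bound n should nevertheless hold for every instance.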

VII. Conclusion 

In this paper, we considered the problem of coop- 
erative data exchange by a group of wireless clients. 
Our figure of merit was the number of transmissions to 
ensure that each client eventually obtains all the data. 
We have established upper and lower bounds on the 
optimum number of transmissions. We also presented a 
deterministic algorithm, referred to as ALGORITHM IE, 
for this problem and analyzed its performance. Empirical 
results indicated that the proposed algorithm performs 
remarkably close to the lower bound. 

¹ Note that in our numerical analysis, we have taken into account 
the number of transmissions for broadcasting unique packets. Since 
this number is the same for all schemes, the relative performance is 
unaffected. 



This work was only a first step towards understanding 
the problem and there are many interesting directions 
for future research. We still do not know the optimal 
solution or the computational complexity of finding one. 
Furthermore, in analogy with the index coding problem, 
an interesting open question here is whether linear codes 
are always optimal, and whether there is any advantage 
to splitting packets before linearly encoding them, i.e. 
whether vector linear codes can lead to a lower number 
of transmissions than scalar linear ones. 

Another important issue is to ensure "fairness" for 
all clients. In this paper we were not concerned with 
the number of transmissions each client makes. In 
practice, clients have limited energy resources and hence, 
it makes sense to find solutions where the number of 
transmissions is as uniformly distributed among different 
clients as possible. 



References 



[1] G. Kramer, R. Berry, A. E. Gamal, H. E. Gamal, 
M. Franceschetti, M. Gastpar, and J. N. Laneman, "Introduction 
to the special issue on models, theory, and codes for relaying 
and cooperation in communication networks," IEEE Trans. 
Inform. Theory, vol. 53, no. 10, pp. 3297-3301, Oct. 2007. 
[2] A. Sendonaris, E. Erkip, and B. Aazhang, "User cooperation 
diversity-Part I: System description," IEEE Trans. Commun., 
vol. 51, no. 11, pp. 1927-1938, Nov. 2003. 
[3] T. S. Rappaport, Wireless Communications: Principles and 
Practice, 2nd ed. Upper Saddle River: Prentice Hall, 2002. 
[4] P. Sadeghi, D. Traskov, and R. Koetter, "Adaptive network 
coding for broadcast channels," in Proc. 2009 Workshop on 
Network Coding, Theory, and Applications (NetCod), Lausanne, 
Switzerland, June 2009, pp. 80-85. 
[5] J. K. Sundararajan, D. Shah, and M. Medard, "ARQ for network 
coding," in Proc. IEEE Int. Symp. on Inform. Theory (ISIT), 
Toronto, Canada, July 2008, pp. 1651-1655. 
[6] Y. Birk and T. Kol, "Coding-on-demand by an informed source 
(ISCOD) for efficient broadcast of different supplemental data 
to caching clients," IEEE Transactions on Information Theory, 
vol. 52, no. 6, pp. 2825-2830, June 2006. 
[7] Z. Bar-Yossef, Y. Birk, T. S. Jayram, and T. Kol, "Index coding 
with side information," in Proc. of the 47th Annual IEEE 
Symposium on Foundations of Computer Science (FOCS), 2006, pp. 
197-206. 
[8] S. E. Rouayheb, A. Sprintson, and C. N. Georghiades, "On the 
relation between the index coding and the network coding 
problems," Proc. of IEEE International Symposium on Information 
Theory (ISIT08), 2008. 
[9] S. E. Rouayheb, A. Sprintson, and C. N. Georghiades, "A new 
construction method for networks from matroids," 
in Proceedings of the IEEE International Symposium on 
Information Theory (ISIT), Seoul, Korea, June 2009. 
[10] S. E. Rouayheb, A. Sprintson, and C. N. Georghiades, "On the 
index coding problem and its relation to network 
coding and matroid theory," 2009, submitted to IEEE 
Transactions on Information Theory. 
[11] D. Shah, Gossip Algorithms (Foundations and Trends in 
Networking). Now Publishers Inc, 2007, vol. 3, no. 1. 
[12] S. Jaggi, P. Sanders, P. A. Chou, M. Effros, S. Egner, K. Jain, 
and L. Tolhuizen, "Polynomial Time Algorithms for Multicast 
Network Code Construction," IEEE Transactions on Information 
Theory, vol. 51, no. 6, pp. 1973-1982, 2005. 

